2,307 research outputs found
Towards an Adaptive Skeleton Framework for Performance Portability
The proliferation of widely available, but very different, parallel architectures
makes the ability to deliver good parallel performance
on a range of architectures, or performance portability, highly desirable.
Irregularly-parallel problems, where the number and size
of tasks is unpredictable, are particularly challenging and require
dynamic coordination.
The paper outlines a novel approach to delivering portable parallel
performance for irregularly parallel programs. The approach
combines declarative parallelism with JIT technology, dynamic
scheduling, and dynamic transformation.
We present the design of an adaptive skeleton library, with a task
graph implementation, JIT trace costing, and adaptive transformations.
We outline the architecture of the protoype adaptive skeleton
execution framework in Pycket, describing tasks, serialisation,
and the current scheduler.We report a preliminary evaluation of the
prototype framework using 4 micro-benchmarks and a small case
study on two NUMA servers (24 and 96 cores) and a small cluster
(17 hosts, 272 cores). Key results include Pycket delivering good
sequential performance e.g. almost as fast as C for some benchmarks;
good absolute speedups on all architectures (up to 120 on
128 cores for sumEuler); and that the adaptive transformations do
improve performance
Costing JIT Traces
Tracing JIT compilation generates units of compilation that
are easy to analyse and are known to execute frequently. The AJITPar
project aims to investigate whether the information in JIT traces can be
used to make better scheduling decisions or perform code transformations
to adapt the code for a specific parallel architecture. To achieve this goal,
a cost model must be developed to estimate the execution time of an
individual trace.
This paper presents the design and implementation of a system for extracting
JIT trace information from the Pycket JIT compiler. We define
three increasingly parametric cost models for Pycket traces. We perform
a search of the cost model parameter space using genetic algorithms to
identify the best weightings for those parameters. We test the accuracy
of these cost models for predicting the cost of individual traces on a set
of loop-based micro-benchmarks. We also compare the accuracy of the
cost models for predicting whole program execution time over the Pycket
benchmark suite. Our results show that the weighted cost model
using the weightings found from the genetic algorithm search has the
best accuracy
Reliable scalable symbolic computation: The design of SymGridPar2
Symbolic computation is an important area of both Mathematics and Computer Science, with many large computations that would benefit from parallel execution. Symbolic computations are, however, challenging to parallelise as they have complex data and control structures, and both dynamic and highly irregular parallelism. The SymGridPar framework (SGP) has been developed to address these challenges on small-scale parallel architectures. However the multicore revolution means that the number of cores and the number of failures are growing exponentially, and that the communication topology is becoming increasingly complex. Hence an improved parallel symbolic computation framework is required.
This paper presents the design and initial evaluation of SymGridPar2 (SGP2), a successor to SymGridPar that is designed to provide scalability onto 10^5 cores, and hence also provide fault tolerance. We present the SGP2 design goals, principles and architecture. We describe how scalability is achieved using layering and by allowing the programmer to control task placement. We outline how fault tolerance is provided by supervising remote computations, and outline higher-level fault tolerance abstractions.
We describe the SGP2 implementation status and development plans. We report the scalability and efficiency, including weak scaling to about 32,000 cores, and investigate the overheads of tolerating faults for simple symbolic computations
The HdpH DSLs for scalable reliable computation
The statelessness of functional computations facilitates both parallelism and fault recovery. Faults and non-uniform communication topologies are key challenges for emergent large scale parallel architectures. We report on HdpH and HdpH-RS, a pair of Haskell DSLs designed to address these challenges for irregular task-parallel computations on large distributed-memory architectures. Both DSLs share an API combining explicit task placement with sophisticated work stealing. HdpH focuses on scalability by making placement and stealing topology aware whereas HdpH-RS delivers reliability by means of fault tolerant work stealing.
We present operational semantics for both DSLs and investigate conditions for semantic equivalence of HdpH and HdpH-RS programs, that is, conditions under which topology awareness can be transparently traded for fault tolerance. We detail how the DSL implementations realise topology awareness and fault tolerance. We report an initial evaluation of scalability and fault tolerance on a 256-core cluster and on up to 32K cores of an HPC platform
JIT costing adaptive skeletons for performance portability
The proliferation of widely available, but very different, parallel architectures makes the ability to deliver good parallel performance on a range of architectures, or performance portability, highly desirable. Irregular parallel problems, where the number and size of tasks is unpredictable, are particularly challenging and require dynamic coordination.
The paper outlines a novel approach to delivering portable parallel performance for irregular parallel programs. The approach combines JIT compiler technology with dynamic scheduling and dynamic transformation of declarative parallelism.
We specify families of algorithmic skeletons plus equations for rewriting skeleton expressions. We present the design of a framework that unfolds skeletons into task graphs, dynamically schedules tasks, and dynamically rewrites skeletons, guided by a lightweight JIT trace-based cost model, to adapt the number and granularity of tasks for the architecture.
We outline the system architecture and prototype implementation in Racket/Pycket. As the current prototype does not yet automatically perform dynamic rewriting we present results based on manual offline rewriting, demonstrating that (i) the system scales to hundreds of cores given enough parallelism of suitable granularity, and (ii) the JIT trace cost model predicts granularity accurately enough to guide rewriting towards a good adaptive transformation
A lattice-theoretic framework for circular assume-guarantee reasoning
We develop an abstract lattice-theoretic framework within which we study soundness and other properties of circular assume-guarantee (A-G) rules constrained by side conditions. We identify a particular side condition, non-blockingness, which admits an intelligible inductive proof of the soundness of circular A-G reasoning. Besides, conditional circular rules based on non-blockingness turn out to be complete in various senses and stronger than a large class of sound conditional A-G rules. In this respect, our framework enlightens the foundations of circular A-G reasoning. Due to its abstractness, the framework can be instantiated to many concrete settings. We show several known circular A-G rules for compositional verification to be instances of our generic rules. Thus, we do the circularity-breaking inductive argument once to establish soundness of our generic rules, which then implies soundness of all the instances without resorting to technically complicated circularity-breaking arguments for each single rule. In this respect, our framework unifies many approaches to circular A-G reasoning and provides a starting point for the systematic development of new circular A-G rules.Wir entwickeln einen abstrakten verbandstheoretischen Rahmen in dem wir die
Korrektheit und andere Eigenschaften bedingter zirkulaerer Assume-Guarantee-
Regeln (A-G-Regeln) untersuchen. Wir isolieren eine besondere Nebenbedingung,
non-blockingness, die zu einem verstaendlichen induktiven Beweis der Korrektheit
zirkulaerer A-G-Regeln fuehrt. Ausserdem sind durch non-blockingness eingeschr
aenkte zirkulaere Regeln vollstaendig und staerker als eine grosse Klasse von
korrekten bedingten A-G-Regeln. So gesehen erhellt unsere Arbeit die Grundlagen
des zirkulaeren A-G-Paradigmas.Aufgrund seiner Abstraktheit kann unser Rahmen zu vielen konkreten Formalismen instanziiert werden. Wir zeigen, dass mehrere bekannte A-G-Regeln zur kompositionalen Verifikation Instanzen unserer generischen Regeln sind. So ist der zirkularitaetsaufloesende Beweis der Korrektheit nur einmal fuer unsere generische Regeln zu fuehren, dann erben alle Instanzen Korrektheit, ohne dass noch einmal ein zirkularitaets-aufloesender Beweis noetig ist. In dieser Hinsicht stellt unser Rahmen eine einheitliche Plattform dar, die verschiedene Ausformungen des
zirkulaeren A-G-Paradigmas umfasst und von der ausgehend systematisch neue
zirkulaere A-G-Regeln entwickelt werden koennen
Entwicklung einer Methode zur Objektivierung der subjektiven Wahrnehmung von antriebsstrangerregten Fahrzeugschwingungen : Development of a method to predict discomfort by powertrain-induced vehicle vibrations
In der vorliegenden Arbeit wird eine Methode zur Objektivierung der subjektiven Wahrnehmung von antriebsstrangerregten Fahrzeugschwingungen entwickelt. Auf Basis der Korrelation mit dem Subjektivurteil werden Auswertepunkte, Schwingungsrichtungen und Auswerte-Algorithmen analysiert und die Methode abgeleitet. Es erfolgt eine Verifizierung der zentralen Hypothese der Arbeit. Abschließend wird die Anwendbarkeit im Handlungssystem eines Fahrzeugherstellers beschrieben und diskutiert
A lattice-theoretic framework for circular assume-guarantee reasoning
We develop an abstract lattice-theoretic framework within which we study soundness and other properties of circular assume-guarantee (A-G) rules constrained by side conditions. We identify a particular side condition, non-blockingness, which admits an intelligible inductive proof of the soundness of circular A-G reasoning. Besides, conditional circular rules based on non-blockingness turn out to be complete in various senses and stronger than a large class of sound conditional A-G rules. In this respect, our framework enlightens the foundations of circular A-G reasoning. Due to its abstractness, the framework can be instantiated to many concrete settings. We show several known circular A-G rules for compositional verification to be instances of our generic rules. Thus, we do the circularity-breaking inductive argument once to establish soundness of our generic rules, which then implies soundness of all the instances without resorting to technically complicated circularity-breaking arguments for each single rule. In this respect, our framework unifies many approaches to circular A-G reasoning and provides a starting point for the systematic development of new circular A-G rules.Wir entwickeln einen abstrakten verbandstheoretischen Rahmen in dem wir die
Korrektheit und andere Eigenschaften bedingter zirkulaerer Assume-Guarantee-
Regeln (A-G-Regeln) untersuchen. Wir isolieren eine besondere Nebenbedingung,
non-blockingness, die zu einem verstaendlichen induktiven Beweis der Korrektheit
zirkulaerer A-G-Regeln fuehrt. Ausserdem sind durch non-blockingness eingeschr
aenkte zirkulaere Regeln vollstaendig und staerker als eine grosse Klasse von
korrekten bedingten A-G-Regeln. So gesehen erhellt unsere Arbeit die Grundlagen
des zirkulaeren A-G-Paradigmas.Aufgrund seiner Abstraktheit kann unser Rahmen zu vielen konkreten Formalismen instanziiert werden. Wir zeigen, dass mehrere bekannte A-G-Regeln zur kompositionalen Verifikation Instanzen unserer generischen Regeln sind. So ist der zirkularitaetsaufloesende Beweis der Korrektheit nur einmal fuer unsere generische Regeln zu fuehren, dann erben alle Instanzen Korrektheit, ohne dass noch einmal ein zirkularitaets-aufloesender Beweis noetig ist. In dieser Hinsicht stellt unser Rahmen eine einheitliche Plattform dar, die verschiedene Ausformungen des
zirkulaeren A-G-Paradigmas umfasst und von der ausgehend systematisch neue
zirkulaere A-G-Regeln entwickelt werden koennen
- …